GTZAN Dataset - Music Genre Classification - Individual Project (Yinjie Liu/20211091)

This README describes the structure and main properties of the dataset, and explains how to re-run data pre-processing, training, validation and evaluation.

Download data (and move into project folder): https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification

First, "pip install" all the required packages listed in the requirements.txt file. I ran my code in PyCharm (Community Edition) using the tensorflow-gpu package, which means the whole process is executed on the GPU (falling back to the CPU if no GPU is available). Note, however, that running on the GPU requires a fair amount of configuration work.

Python 3.6.0 / Keras 2.3.1 / tensorflow-gpu 2.0.0 / CUDA 10.0 / cuDNN 7.6.0.64 (for CUDA 10.0)

You have to use these exact versions for the five packages above, since they are tightly coupled to one another.

The CUDA installation package, the cuDNN files and the TensorFlow wheel file can be found in the venv_python3.6 directory.

If you want to run the code in the PyCharm terminal:
Install CUDA first. After the CUDA installation, unzip the cuDNN archive and copy the files from each of its subdirectories into the corresponding subdirectory of "NVIDIA GPU Computing Toolkit\CUDA\v10.0".
Then execute nvcc -V in your terminal. If the installation succeeded, the CUDA version number will be printed.

If you want to run the code in a Jupyter notebook, install the packages with:
pip install tensorflow_gpu-2.0.0-cp36-cp36m-win_amd64.whl
conda install cudatoolkit=10.0
conda install cudnn

Directory Structure

Comp47650_Yinjie_Liu_20211091
-checkpoints
   |
    ----> cnn1
    ----> fcnn1
    ----> fcnn2
- Data
    |
     ---> features_3_sec.csv
     ---> features_30_sec.csv
     ---> test.csv
     ---> train.csv
     ---> val.csv
     ---> cnn1_json
        |
        ===> data_10.json
     ---> genres_original
        |
        ===> blues
                .
                . 
                .
        ===> rock    
     ---> images_original
        |
        ===> blues
                .
                . 
                .
        ===> rock

- figs
    |
     ---> cnn1_training_vis.png
     ---> fcnn1_training_vis.png
     ---> fcnn2_training_vis.png   
     ---> PCA_Scattert.png   
- logs
    |
     ---> cnn1
        |
        ---> CNN1_040422_232546.json
     ---> fcnn1
        |
        ---> FCNN1_050422_202621.json
     ---> fcnn2
        |
        ---> FCNN2_050422_203047.json

- models
    |
    ---->__pycache__
        |
        ---> cnn1.cpython-36.pyc
        ---> fcnn1.cpython-36.pyc
        ---> fcnn2.cpython-36.pyc
    ----> cnn1.py
    ----> fcnn1.py
    ----> fcnn2.py
- utils
    |
    ---->__pycache__
        |
        ---> cnn1.cpython-36.pyc
        ---> fcnn1.cpython-36.pyc
        ---> fcnn2.cpython-36.pyc
    ---> Datasets.py
    ---> params.py
    ---> plotting.py
hparams.yaml
main.py
README.html
README.ipynb
requirements.txt

At the beginning of the work, I analysed the data in detail. I show the structure of the features_3_sec.csv file, which contains 57 attributes per row. I then picked one track to show its zoomed audio wave graph and simple audio waveplot, and plotted a principal component analysis of the ten music genres.

Data analysis

Let's see how many rows each music genre has.
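Counting rows per genre is a one-liner with pandas. A minimal sketch on a toy frame; in the real project this would read Data/features_3_sec.csv and use its label column:

```python
import pandas as pd

# Hypothetical stand-in: the real call would be
# df = pd.read_csv("Data/features_3_sec.csv")
df = pd.DataFrame({"label": ["blues", "blues", "rock", "jazz", "rock", "rock"]})

# Rows per genre, sorted by count descending
counts = df["label"].value_counts()
print(counts)
```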

Training & validation

A CNN model was the first approach that came to mind, and I researched how to process audio files for it.
The librosa package can be used to extract the MFCC features of an audio file.
I saved these in the Data/cnn1_json/data_10.json file, which contains each audio file's MFCC data, its label, and the mapping from labels to genre directories.
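The exact layout of data_10.json is not documented here, so the key names below (mapping, mfcc, labels) are assumptions. The sketch only illustrates round-tripping such data through JSON; the real MFCC arrays would come from librosa.feature.mfcc:

```python
import json

# Assumed layout of Data/cnn1_json/data_10.json (key names are hypothetical):
data = {
    "mapping": ["blues", "classical"],   # label index -> genre directory name
    "mfcc": [[[0.1, 0.2], [0.3, 0.4]]],  # per-segment lists of MFCC frames
    "labels": [0],                       # genre index for each segment
}

# Write and re-read, as the pre-processing and training steps would do
with open("data_10_example.json", "w") as f:
    json.dump(data, f)

with open("data_10_example.json") as f:
    loaded = json.load(f)
```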

I used Keras to implement and train a CNN; its compile parameters can be found in the hparams.yaml file.

Train CNN model

You can run python main.py CNN1 --write_data True in the PyCharm terminal. (Warning: the active virtual environment should be venv_python3.6.)

Or you can run ! python main.py CNN1 --write_data True directly in the Jupyter notebook.

This model's accuracy turned out not to be ideal, so I switched to processing the features_3_sec.csv file directly and trained an FCNN (fully connected neural network) on it. I split the data into training (70%), validation (20%) and test (10%) sets.
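The 70/20/10 split can be sketched with a shuffled index in plain NumPy (an illustration; the project may use a different splitting routine):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100                     # number of rows (illustrative; the real CSV has more)
idx = rng.permutation(n)    # shuffle once, then slice contiguously

n_train = int(0.7 * n)      # 70% training
n_val = int(0.2 * n)        # 20% validation; the remaining 10% is the test set
train_idx = idx[:n_train]
val_idx = idx[n_train:n_train + n_val]
test_idx = idx[n_train + n_val:]
```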

Train FCNN models

You can run python main.py FCNN1 --write_data True in the PyCharm terminal. (Warning: the active virtual environment should be venv_python3.6.) You will get the result shown below.

Or you can run ! python main.py FCNN1 --write_data True directly in the Jupyter notebook.

As the result above shows, the accuracy exceeds 90%. I then built a second, more complex FCNN model with a dropout layer after each fully connected layer, and also increased the layer dimensionality.
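What a dropout layer does after each fully connected layer can be sketched in plain NumPy (inverted dropout, the scheme Keras uses; this is an illustration, not the project's code):

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    if not training:
        return x                          # dropout is disabled at inference time
    mask = rng.random(x.shape) >= rate    # keep each unit with probability 1 - rate
    return x * mask / (1.0 - rate)        # rescale so expected activation is unchanged

rng = np.random.default_rng(0)
activations = np.ones((4, 8))             # pretend output of a dense layer
out = dropout(activations, rate=0.5, rng=rng)
```

Surviving units are scaled up by 1 / (1 - rate), so the layer's expected output matches what the network sees at inference, when dropout is off.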

You can run python main.py FCNN2 in the PyCharm terminal. (Warning: the active virtual environment should be venv_python3.6.) You will get the result shown below.

Or you can run ! python main.py FCNN2 directly in the Jupyter notebook.

I set the number of epochs to 500 and used the RMSprop optimizer, which finally yields more than 93% accuracy for this model.

Let's take a look at the validation performance of these three models.

You can run python eval.py logs/cnn1/CNN1_060422_122834.json 5 in the PyCharm terminal. (Warning: the active virtual environment should be venv_python3.6.) You will get the result shown below.

Or you can run ! python eval.py logs/cnn1/CNN1_060422_122834.json 5 directly in the Jupyter notebook.

You can run python eval.py logs/fcnn1/FCNN1_060422_124733.json 5 in the PyCharm terminal. (Warning: the active virtual environment should be venv_python3.6.) You will get the result shown below.

Or you can run ! python eval.py logs/fcnn1/FCNN1_060422_124733.json 5 directly in the Jupyter notebook.

You can run python eval.py logs/fcnn2/FCNN2_060422_125313.json 5 in the PyCharm terminal. (Warning: the active virtual environment should be venv_python3.6.) You will get the result shown below.

Or you can run ! python eval.py logs/fcnn2/FCNN2_060422_125313.json 5 directly in the Jupyter notebook.
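eval.py reads the JSON training logs under logs/. The log structure is not documented here, so the metric-name-to-per-epoch-list layout below is an assumption; the sketch only shows how one might pull the best validation accuracy out of such a log:

```python
import json

# Hypothetical log layout: metric name -> list of per-epoch values
log = {"accuracy": [0.60, 0.85, 0.93], "val_accuracy": [0.55, 0.80, 0.90]}
with open("example_log.json", "w") as f:
    json.dump(log, f)

# Reload the log, as an evaluation script would
with open("example_log.json") as f:
    history = json.load(f)

best_val = max(history["val_accuracy"])
print(f"best validation accuracy: {best_val:.2f}")
```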